Abstract: This work tackles the problem of energy-efficient
distributed power control in wireless networks with a large number
of transmitters. The problem is modeled as a dynamic game.
Each transmitter-receiver link is characterized by a state, given by
the available energy and/or the individual channel state, whose
evolution is governed by given dynamics. Since equilibrium analysis
in such a (stochastic) game is generally difficult and sometimes
intractable, the problem is approximated by exploiting the
large-system assumption. Under an appropriate exchangeability
assumption, the corresponding mean field game is well defined and
is studied in detail for special cases. The main contribution of
this work is to show how mean field games can be applied to the
problem under investigation and to provide illustrative numerical
results. Our results indicate that this approach can lead to
significant gains in energy-efficiency at the resulting equilibrium.

I. Introduction 

We study distributed wireless networks, in which mobile
terminals are free to choose their own power control policies.
This can be due to the absence of a central node controlling
the terminals in the network, or to complexity issues when
there is a large number of terminals. Such a scenario can
occur in ad hoc networks [7], unlicensed band communications,
and cognitive radio [5], [15]. Modeling the terminals
as rational agents who choose their power control policies
to maximize some utility leads to applying game theory
to the problem [8], [10]. In the problem under study, the
wireless network model is a multiple access channel (MAC)
and the channel access method is code division multiple
access (CDMA). In this network, the mobile terminals aim
to maximize their expected energy-efficiency over a given
time duration. For this purpose, they must adapt their power
control policies to varying channel conditions and to the
decreasing energy in their batteries.

By energy-efficiency, we mean the number of successfully
decoded bits at the receiver per joule consumed at the
transmitter, as defined in [6]. In this seminal paper and related
works nU, 0, 0, H, ICQ, the power control problem is 
modeled by a sequence of static games that are independent from
one stage to the next. This approach does not capture the
interactions that arise among the players when a game
is repeated. In [12], it is shown that modeling the problem
as a repeated game can lead to equilibrium power control
policies that are more efficient (in the sense of Pareto) than the
Nash equilibrium of the static formulation. However, this
repeated model uses a normalized stage game that does not
depend on the channel realizations. One of the motivations
of our work is to account for the impact of channel realizations
by modeling the problem as a stochastic game.
Moreover, we account for the long-term energy constraint
of each terminal, since the energy remaining in the battery
decreases as power is consumed. Precisely, the energy-efficient
power control problem in a MAC under long-term
energy constraints is modeled as a stochastic differential game
(SDG). However, characterizing the performance
of distributed networks modeled by an SDG becomes hard, and
even intractable, when the number of players is large.
The same holds for determining individually optimal
control strategies. In [13], this problem is overcome by the use
of mean field games (MFG). MFGs [11] are a way of
approximating a stochastic differential (or difference) game
by a much more tractable model. Under the assumption of
individual state information, the idea is to exploit the large
number of players as an opportunity to
simplify the analysis. Typically, instead of depending on the
actions and states of all the players, the mean field utility of a
player depends only on its own action and state, and depends
on the others only through a mean field. The main contribution
of this paper is to show that the model developed in [13]
can be particularized to cases in which it is possible to
derive equilibrium power control policies. These policies are
illustrated with numerical results and compared
with the equilibrium of the static game formulation and the
equilibrium of the discounted repeated game studied in [12].

Our paper is structured as follows. We describe the studied
wireless network and the evolution laws for the channels and
the energy in Sec. II. In Sec. III, the problem of power
control is modeled as an SDG and is shown to converge to
an MFG under given conditions. Additionally, the regime of
large energy budgets and the regime of quasi-static channels
are discussed. The resulting equilibrium power control strategies
are compared to the classical power control policies from the
static game and the discounted repeated game in Sec. IV.

Notations: In the following, $\nabla_x f$ and $\Delta_x f$ respectively
denote the gradient and the Laplacian of the function $f$
w.r.t. the vector $x$. The divergence operator w.r.t. the vector
$x$ is denoted by $\mathrm{div}_x$. The scalar product in the space
$\mathcal{A}$ is denoted by $\langle \cdot, \cdot \rangle_{\mathcal{A}}$.

II. System Model 

We consider one cell with a single base station and $K$
mobile terminals. Since the uplink power control problem
is addressed, the mobile terminals are the transmitters and the
base station is the receiver. The radio resource is used as a MAC;
consequently, the transmitters interfere with each other. Each
transmitter sends a signal to the common receiver and has to
choose the power level of the transmitted signal. In order to
optimize its individual energy-efficiency, i.e., the ratio of its
throughput to its transmit power, each transmitter adapts its
power level. This choice depends on the quality of the channel
between the transmitter and the receiver, the power levels
chosen by the other transmitters, and the energy remaining in
its battery. The set of transmitters is denoted by
$\mathcal{K} = \{1, \ldots, K\}$ and, for each
transmitter $k \in \mathcal{K}$, the signal-to-interference-plus-noise
ratio (SINR) writes

$$\gamma_k = \frac{p_k |h_k|^2}{\sigma^2 + I_k}, \tag{1}$$

where $p_k \in \mathcal{P}_k$ and $|h_k|^2$ are the power level and the
channel gain of transmitter $k$, respectively. The variance of the
noise is denoted by $\sigma^2$. The interference term is
$I_k = \frac{1}{N} \sum_{j \in \mathcal{K}, j \neq k} p_j |h_j|^2$,
with $N$ a processing gain due
to interference management at the receiver. For example, in
CDMA systems, $N$ represents the spreading factor.
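
As a concrete illustration of the SINR definition, the following Python sketch evaluates the SINR of every transmitter at once. It assumes the usual form $\gamma_k = p_k |h_k|^2 / (\sigma^2 + I_k)$ with $I_k$ the interference term defined above. The function name `sinr` and the numerical values are hypothetical and only serve the example.

```python
import numpy as np

def sinr(p, h, sigma2, N):
    """Return gamma_k = p_k |h_k|^2 / (sigma2 + I_k) for every transmitter k.

    I_k = (1/N) * sum over j != k of p_j |h_j|^2, N being the processing gain.
    """
    p = np.asarray(p, dtype=float)
    gain = np.abs(np.asarray(h)) ** 2                 # channel gains |h_k|^2
    received = p * gain                               # received powers p_k |h_k|^2
    interference = (received.sum() - received) / N    # I_k, own signal excluded
    return received / (sigma2 + interference)

# Hypothetical example: 3 identical transmitters, spreading factor N = 4.
gamma = sinr(p=[1.0, 1.0, 1.0], h=[1.0, 1.0, 1.0], sigma2=0.1, N=4)
```

With these values each transmitter sees an interference of 2/4 = 0.5, so every SINR equals 1/0.6.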

The instantaneous energy-efficiency, in bit/joule, is defined
as

$$u_k = \frac{R \, f(\gamma_k)}{p_k}, \tag{2}$$

where $\gamma_k$ denotes the SINR of transmitter $k$ and $R$ is the
constant transmission rate of the transmitter. The function $f$
is the probability of having no outage; it takes its values
in $[0, 1]$ and depends on the SINR.
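
A minimal sketch of this utility in Python is given below. The success function used here, $f(\gamma) = \exp(-a/\gamma)$ with a constant $a$, is an assumption made for illustration (the text only requires $f$ to take values in $[0, 1]$), and the names `success_prob` and `energy_efficiency` are hypothetical.

```python
import math

def success_prob(gamma, a=1.0):
    """Assumed success function f(gamma) = exp(-a / gamma), with values in [0, 1]."""
    return math.exp(-a / gamma) if gamma > 0.0 else 0.0

def energy_efficiency(p, gamma, rate=1e6, a=1.0):
    """Instantaneous energy-efficiency R f(gamma) / p, in bit/joule."""
    if p <= 0.0:
        return 0.0   # no transmission: no bits delivered, no energy spent
    return rate * success_prob(gamma, a) / p

# Hypothetical operating point: 0.5 W transmit power, SINR of 2, R = 1 Mbps.
u = energy_efficiency(p=0.5, gamma=2.0)
```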

Two parameters define the state of each transmitter: the 
channel coefficients to the receiver and the remaining energy 
in its battery. The evolution of these parameters is modeled 
by the two following laws. The channel evolution law is 



(3) 



where $\mu \in \mathbb{C}$ and $\eta \in \mathbb{R}$ are constants related
to the channel statistics and the $W_k(t)$ are mutually independent
Wiener processes. Depending on the value of $\eta$, this law can model
slow fading or fast fading. Additional properties of the
asymptotic behavior of the channels are given in [13].

The evolution law of the remaining energy in the battery is
given by

$$dE_k(t) = -p_k(t)\,dt. \tag{4}$$

This means that the energy in the battery decreases at a rate
equal to the transmit power.
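
The energy law (4) can be integrated numerically with a simple explicit Euler scheme, as sketched below. The function name `simulate_battery`, the clamping of the power to the energy that is still available, and the constant 2 W policy are assumptions added for the example.

```python
def simulate_battery(e0, policy, t0, t1, steps):
    """Integrate dE = -p dt with explicit Euler steps and return the trajectory."""
    dt = (t1 - t0) / steps
    e = e0
    trajectory = [e]
    for n in range(steps):
        t = t0 + n * dt
        p = max(policy(t, e), 0.0)     # transmit power cannot be negative
        p = min(p, e / dt)             # cannot spend more than what is left
        e = max(e - p * dt, 0.0)       # guard against rounding below zero
        trajectory.append(e)
    return trajectory

# Hypothetical example: 20 J battery drained at a constant 2 W for 5 s.
traj = simulate_battery(e0=20.0, policy=lambda t, e: 2.0, t0=0.0, t1=5.0, steps=50)
```

After 5 s at 2 W, 10 J have been consumed, so the final entry of `traj` is 10 J up to rounding.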

III. From the Stochastic Differential Game 
Formulation to the Mean Field Game 

A. Stochastic Differential Game 

With the model defined in the previous section, the problem
of each transmitter maximizing its expected energy-efficiency
over a given time interval can be addressed with an SDG. Time
is assumed to be continuous, i.e., $t \in \mathbb{R}$. The time horizon
of the game is finite: it is the interval $[T, T']$.



Definition 1 (SDG model of the power control problem):
The stochastic differential power control game is defined by
the 5-tuple
$\mathcal{G} = (\mathcal{K}, \{\mathcal{P}_k\}_{k \in \mathcal{K}}, \{\mathcal{X}_k\}_{k \in \mathcal{K}}, \{\mathcal{S}_k\}_{k \in \mathcal{K}}, \{U_k\}_{k \in \mathcal{K}})$,
where:

• $\mathcal{K} = \{1, \ldots, K\}$ is the set of players. Here, the players
correspond to the transmitters.

• $\mathcal{P}_k$ is the set of actions of player $k \in \mathcal{K}$. Here,
the action set corresponds to the interval of transmit power values.

• $\mathcal{X}_k$ is the state space of player $k \in \mathcal{K}$. The state
of player $k$ at time $t$ is defined by
$\underline{X}_k(t) = [E_k(t), h_k(t)]^T$.

• $\mathcal{S}_k$ is the set of feedback control policies for player
$k \in \mathcal{K}$. A control policy is denoted by $p_k(T \to T')$;
it is a function of time between $T$ and $T'$, two reals
such that $T' > T$.

• The average utility function $U_k$ is defined by

$$U_k(p(T \to T')) = \mathbb{E}\left[ \int_T^{T'} u_k(p(t), \underline{X}(t))\,dt + q(\underline{X}(T')) \right], \tag{5}$$

where $p(T \to T') = (p_1(T \to T'), \ldots, p_K(T \to T'))$ is
the control strategy profile, $\underline{X}(t) = [\underline{X}_1(t), \ldots, \underline{X}_K(t)]$
is the state profile, $q(\underline{X}(T'))$ is the utility at the final
state, and $u_k$ is the instantaneous utility.
Although a Nash equilibrium can be proven to exist in this
game under given conditions [13], obtaining the expression
of an equilibrium requires solving a system of $2K$ coupled
equations. The complexity of such a system
makes it intractable for large $K$. We therefore
consider the MFG associated with the problem to overcome
this complexity issue.

B. The Mean Field Game Analysis ($K \to +\infty$)

1) Assumptions: When the number of players goes to
infinity, under the assumption of the exchangeability of the
players of the game and the convergence of the interference
term, the SDG can be proven to converge to an MFG. The
exchangeability property (see [13] for more details) is ensured
if each player only knows its individual state and implements
a homogeneous admissible control: $p_k(t) = \alpha(t, \underline{X}_k(t))$. A
sufficient condition for the convergence of the interference
term is $\lim_{K, N \to \infty} K/N = \theta > 0$. In this new game, the set
of players is continuous and the generic state of a player is
given by $s(t) = [E(t), h(t)]^T$, whose distribution is given by
$m_t$ ($m_t$ is the mean field). The SINR can be rewritten as:

$$\gamma(t) = \frac{p(t)\,|h(t)|^2}{\sigma^2 + I(t, m_t)}, \tag{6}$$

with $I(t, m_t)$ the interference resulting from the continuum of
other players

$$I(t, m_t) = \int |h|^2 \, \alpha(t, s) \, m_t(ds), \tag{7}$$



where $\alpha(t, s)$ denotes the generic power response at time $t$
and state $s$. With $\gamma(t)$ the SINR in (6), the instantaneous
utility writes

$$u(p(t), s(t), m_t) = \frac{R \, f(\gamma(t))}{p(t)}. \tag{8}$$



The main advantage of the MFG formulation is that the
utility of each player depends only on its own state $s(t)$ and
a common mean field $m_t$.
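
To make the role of the mean field concrete, the sketch below approximates the interference integral (7) by a weighted sum over a discretized state distribution, and then evaluates the SINR seen by a generic player. It assumes that the noise variance and the mean field interference add up in the SINR denominator. The names `mean_field_interference` and `mean_field_sinr`, the three-point grid, and the power response values are hypothetical.

```python
import numpy as np

def mean_field_interference(gain, alpha, m):
    """Approximate I = integral of |h|^2 alpha(t, s) m_t(ds) by a weighted sum.

    gain[i], alpha[i] and m[i] are the channel gain, the power response and the
    probability mass of the i-th discretized state; the weights m sum to one.
    """
    gain, alpha, m = (np.asarray(x, dtype=float) for x in (gain, alpha, m))
    return float(np.sum(gain * alpha * m))

def mean_field_sinr(p, own_gain, sigma2, interference):
    """SINR of a generic player facing a given mean field interference."""
    return p * own_gain / (sigma2 + interference)

# Hypothetical example: unit channel gains, three equally likely states.
I_mf = mean_field_interference(gain=[1.0, 1.0, 1.0],
                               alpha=[0.0, 0.5, 1.0],
                               m=[1 / 3, 1 / 3, 1 / 3])
gamma_mf = mean_field_sinr(p=0.6, own_gain=1.0, sigma2=0.1, interference=I_mf)
```

The player only needs its own state and the scalar `I_mf`, which is the practical meaning of depending on the others through the mean field.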

2) Solution to the mean field best-response problem: For
the mean field optimal trajectory $m_t^*$, the best response of the
generic player is such that there exists an average utility

$$v_t = \sup_{p(t \to T')} \mathbb{E}\left[ \int_t^{T'} u(p(\tau), s(\tau), m_\tau^*)\,d\tau + q(s(T')) \right]. \tag{9}$$



Conversely, the control policy resulting from (9) has to lead
to the distribution trajectory $m_t^*$. Consequently, a solution of
the mean field response problem is a solution of the system (10),
with $v_{T'} = q(s(T'))$, $m_T$ known and

$$H(s, u', m) = \sup_p \{ u(p, s, m) - p \cdot u' \}. \tag{11}$$
The first equation is a Hamilton-Jacobi-Bellman-Fleming
equation, which gives the behavior of $v_t$ for a given $m_t$. It
is coupled with a Fokker-Planck-Kolmogorov equation, which
gives the behavior of $m_t$ for a given $v_t$. The former
is a backward equation whereas the latter is a forward
equation. This means that an initial condition for $m_t$ and a
final condition for $v_t$ are needed to solve the system of two
equations.
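
The Hamiltonian $H(s, u', m) = \sup_p \{u(p, s, m) - p \cdot u'\}$ rarely has a closed form, so in a numerical solver it can be approximated by a grid search over the power, as in the sketch below. This is only an illustration: the success function $f(\gamma) = \exp(-a/\gamma)$, the power bound `p_max`, the grid size, and all numerical values are assumptions, and `hamiltonian` is a hypothetical name.

```python
import numpy as np

def hamiltonian(dv_dE, gain, interference, sigma2=0.1, rate=1e6,
                a=1.0, p_max=10.0, n_grid=2001):
    """Approximate H = sup_p { u(p) - p * dv_dE } by a grid search over p.

    Returns the pair (H, maximizing power).
    """
    p = np.linspace(0.0, p_max, n_grid)[1:]        # strictly positive powers
    gamma = p * gain / (sigma2 + interference)     # SINR for each candidate power
    u = rate * np.exp(-a / gamma) / p              # assumed utility R f(gamma) / p
    objective = u - p * dv_dE
    best = int(np.argmax(objective))
    if objective[best] < 0.0:
        return 0.0, 0.0                            # staying silent (p = 0) is better
    return float(objective[best]), float(p[best])

# Hypothetical example: zero and positive marginal value of energy.
h0, p0 = hamiltonian(dv_dE=0.0, gain=1.0, interference=0.4)
h1, p1 = hamiltonian(dv_dE=5e5, gain=1.0, interference=0.4)
```

With a zero marginal value of energy the maximizer is the myopic energy-efficient power, here close to 0.5 W. A positive marginal value penalizes consumption and lowers the selected power.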

C. Two particular regimes of the mean field game 

Solving (10) in the general case is complicated since the
state of a transmitter includes both its energy and its channel
coefficients. However, it is possible to study the solutions of
the mean field response problem for these two parts separately.

1) Large energy budgets: If energy budgets are large 
enough, the variation of energy during the game can be 
neglected. Consequently, only channel coefficients can be 
considered as the state of the transmitter. In this case, the 
mean field problem reduces to 



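$$\begin{cases}
\dfrac{\partial v_t}{\partial t} + \sup_p u(p, h, m_t) = 0, \\[2mm]
\dfrac{\partial m_t}{\partial t} + \mathrm{div}_h\left( m_t \, \tfrac{1}{2} (\mu - h) \right) = \dfrac{\eta^2}{2} \Delta_h m_t.
\end{cases} \tag{12}$$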



The first equation amounts to choosing the Nash equilibrium
from [6] as the power control. The second equation only
depends on the channel statistics and gives the evolution of the
distribution of the channels. The second equation is solved first
and the solution $m_t$ is inserted in the first equation to obtain
the power control.



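2) Quasi-static channels: Considering only the energy as the
state of a transmitter amounts to assuming that the channel
coefficients are constant during the time interval $[T, T']$. In
this case, the mean field problem turns into

$$\begin{cases}
\dfrac{\partial v_t}{\partial t} + \sup_p \left\{ u(p, m_t) - p \dfrac{\partial v_t}{\partial E} \right\} = 0, \\[2mm]
\dfrac{\partial m_t}{\partial t} - \dfrac{\partial}{\partial E} \left( m_t \, p^* \right) = 0,
\end{cases} \tag{13}$$

with $p^* = \arg\sup_p \left\{ u(p, m_t) - p \frac{\partial v_t}{\partial E} \right\}$.
The first equation gives
$v_t$ and $p^*(T \to T')$ given $m_t$. The second equation gives $m_t^*$
given $p^*(T \to T')$. The equilibrium power control policy resulting
from this case is illustrated in the following section.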
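
For a given power policy, the forward equation of the quasi-static regime is a transport equation that moves probability mass toward lower energy levels, consistently with $dE = -p\,dt$. The sketch below shows one possible explicit upwind discretization, written for $\partial m_t / \partial t - \partial (m_t p^*) / \partial E = 0$ on a uniform energy grid. It is not the scheme used by the authors; the name `fpk_step`, the grid, the step sizes, and the linear policy are hypothetical.

```python
import numpy as np

def fpk_step(m, p, dE, dt):
    """One explicit upwind step of dm/dt - d(p m)/dE = 0 on a uniform grid.

    m[i] is the probability mass at energy level i * dE and p[i] >= 0 is the
    power used at that level.
    """
    if dt * np.max(p) > dE:
        raise ValueError("time step too large for stability (CFL condition)")
    outflow = p * m                         # mass leaving level i toward level i - 1
    outflow[0] = 0.0                        # an empty battery cannot be drained further
    inflow = np.append(outflow[1:], 0.0)    # mass arriving from level i + 1
    return m + dt / dE * (inflow - outflow)

# Hypothetical example: uniform initial distribution on [0, 20] J.
energy = np.linspace(0.0, 20.0, 41)         # grid spacing dE = 0.5 J
m = np.full(energy.size, 1.0 / energy.size)
policy = 0.1 * energy                       # hypothetical power policy (W)
mean_before = float(np.sum(energy * m))
for _ in range(100):
    m = fpk_step(m, policy, dE=0.5, dt=0.1)
mean_after = float(np.sum(energy * m))
```

The scheme conserves the total mass and keeps it non-negative under the stated step-size condition. In a full solver this forward step would alternate with a backward sweep of the first equation until the pair $(v_t, m_t)$ stops changing.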

IV. Numerical Results for Quasi-Static Channels

In this section, we provide illustrative numerical results 
for the particular case of quasi-static channels, i.e., when the 
energy dynamics are faster than the channel dynamics. For 
the implementation of the proposed scheme, each terminal 
requires only the knowledge of its own channel state and 
energy level to choose the transmit power. 

A. Comparison with other NE 

To evaluate our results, we compare the
equilibrium of the MFG with the Nash equilibria (NE) of two other
well-known games.

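1) The static Nash equilibrium: This is the classical NE from the work
of [6]. The equilibrium point is given by the equation:

$$\forall k \in \mathcal{K}, \quad p_k^* = \frac{\sigma^2}{|h_k|^2} \, \frac{\beta^*}{1 - \theta \beta^*}, \tag{14}$$

where $\theta$ is the limit of $K/N$ introduced in Sec. III-B and
$\beta^*$ is the unique solution of the equation

$$x f'(x) - f(x) = 0. \tag{15}$$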
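
As an illustration, the target SINR and the resulting power can be computed numerically as follows. The sketch assumes the success function $f(x) = \exp(-a/x)$ with a hypothetical constant $a$, for which the equation $x f'(x) - f(x) = 0$ has the closed-form root $x = a$; it also assumes that the power takes the form $\sigma^2 \beta^* / (|h_k|^2 (1 - \theta \beta^*))$, which requires $\theta \beta^* < 1$. The helper names are hypothetical.

```python
import math

def f(x, a=1.0):
    """Assumed success function f(x) = exp(-a / x)."""
    return math.exp(-a / x)

def f_prime(x, a=1.0):
    """Derivative of the assumed success function."""
    return a / x ** 2 * math.exp(-a / x)

def bisect(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi] by bisection; g must change sign on the interval."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0.0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

def static_nash_power(gain, sigma2, theta, beta_star):
    """Static-game power for a channel gain |h_k|^2 and target SINR beta_star."""
    return sigma2 / gain * beta_star / (1.0 - theta * beta_star)

beta_star = bisect(lambda x: x * f_prime(x) - f(x), 0.1, 10.0)
p_star = static_nash_power(gain=1.0, sigma2=0.1, theta=0.5, beta_star=beta_star)
```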

2) The repeated game "operating point": When the power
control game is treated as a discounted repeated game,
there are several NE. In [12], the authors propose an
equilibrium point, called the "operating point", which
can be very close to the global optimum. This
equilibrium is given by the equation:

$$\forall k \in \mathcal{K}, \quad \tilde{p}_k = \frac{\sigma^2}{|h_k|^2} \, \frac{\tilde{\beta}}{1 - \theta \tilde{\beta}}, \tag{16}$$

where $\tilde{\beta}$ is the unique solution of the equation

$$x (1 - \theta x) f'(x) - f(x) = 0. \tag{17}$$
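
The operating point can be compared with the static equilibrium in a few lines. Under the assumed success function $f(x) = \exp(-a/x)$ (an illustrative choice with a hypothetical constant $a$), the equation $x(1 - \theta x) f'(x) - f(x) = 0$ has the closed-form root $a / (1 + a\theta)$, while the static equation $x f'(x) - f(x) = 0$ gives $x = a$. The sketch below checks the first root numerically and evaluates both powers, assuming they share the form $\sigma^2 \beta / (|h_k|^2 (1 - \theta\beta))$. All names and values are hypothetical.

```python
import math

def f(x, a):
    """Assumed success function f(x) = exp(-a / x)."""
    return math.exp(-a / x)

def f_prime(x, a):
    """Derivative of the assumed success function."""
    return a / x ** 2 * math.exp(-a / x)

def residual_17(x, theta, a):
    """Left-hand side of x (1 - theta x) f'(x) - f(x) = 0."""
    return x * (1.0 - theta * x) * f_prime(x, a) - f(x, a)

def power(gain, sigma2, theta, beta):
    """Power reaching the target SINR beta when all players use the same target."""
    return sigma2 / gain * beta / (1.0 - theta * beta)

theta, a = 0.5, 1.0
beta_tilde = a / (1.0 + a * theta)          # closed form for the assumed f
p_tilde = power(gain=1.0, sigma2=0.1, theta=theta, beta=beta_tilde)
p_star = power(gain=1.0, sigma2=0.1, theta=theta, beta=a)   # static NE target is a
```

In this example the operating point targets a lower SINR and uses half the power of the static equilibrium, which is in line with its better energy-efficiency.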



B. Parameters used 

For the simulations, we take the following parameters:

1) the rate $R = 1$ Mbps,

2) the noise level with path loss, $\sigma^2 = 0.1$ W,

3) the channel gain $\mathbb{E}|h(t)|^2 = 1$,

4) the success function f(-f) = exp(— — ) J2). 

The average channel gain for all players at all times is taken
to be the constant $\mathbb{E}|h(t)|^2 = 1$. The initial energy
distribution of the terminals, $m_T$, is specified for each figure. The
final condition $v_{T'} = q(s(T'))$ is set to 0, so that the
comparison with the equilibrium power policies of the
static game and the discounted repeated game is fair. Indeed, there
is no final reward in these two games. We take the maximum
available energy to be $E_{\max} = 20$ J, and the time duration $T' - T$
is specified for each figure.

TABLE I
Simulation parameters for the presented figures.

C. Discussion 

In Fig. 1, the equilibrium power policy shows that
terminals starting the game with a high energy level (20 J in the
figure) start transmitting at a high power level, which then
decreases with time. It can also be noted that terminals
starting the game with low energy do not transmit at high
power at the beginning of the game. They first wait for
other terminals to empty their batteries in order to suffer from
less interference. This phenomenon is highlighted in Fig. 2.

Fig. 1. MFG equilibrium power policy w.r.t. time and energy.

Fig. 2. MFG equilibrium power policies w.r.t. time for 3 initial amounts of
energy, for a short game duration.

In Fig. 3, where terminals start the game with a uniform
energy distribution, it can be seen that the proportion of terminals
with high energy (20 J) decreases with time, whereas an
increasing proportion of terminals have empty batteries. A
similar behavior is observed in Fig. 4, except that in this case there
are only terminals with high energy at the beginning.

Fig. 3. Evolution of the energy distribution with the equilibrium power policy
of the MFG for uniform initial distribution.

Fig. 4. Evolution of the energy distribution with the equilibrium power policy
of the MFG for non-uniform initial distribution.

Fig. 5. MFG equilibrium power policies w.r.t. time for 3 initial amounts of
energy, for a long game duration.


Since the solution to the mean field response problem is
determined for a given time duration, the equilibrium power
control policy naturally depends on this duration. Hence, it
is interesting to increase the time duration, as is done in
Fig. 5, to capture the impact on the power policy. Globally,
a longer duration entices the terminals to consume less power to
make their batteries last longer. This low consumption gives the
terminals a better energy-efficiency than with the equilibrium from
the static game or the equilibrium from the discounted repeated
game, as illustrated in Fig. 6.

Fig. 6. Comparison of the average energy-efficient utilities with the MFG
equilibrium power policy, for a long game duration, against the equilibrium
from the static game, and the operating point from the repeated game
formulation defined earlier.

V. Conclusion

In this work, we treat the problem of power control games
as a mean field game, taking into account the limited energy
available to mobile transmitters and the effects of channel
fading. Provided that convergence conditions are met, the
mean field approach offers interesting and useful results. While
the resulting partial differential equations are hard to solve in
the general case, we analyze some special cases where
numerical results can be obtained. The numerical results are
encouraging, but they have only been computed for the case
of quasi-static channels.

Some of our key results that can be of interest to engineers
and other researchers are presented below:

1) Terminals with a low starting energy transmit with lower
power at the start and slowly increase their power with
time, when the energy distribution is homogeneous and
$T' - T$ is small.

2) In the same case, terminals with a high starting energy
start with a high power and slowly decrease their power
with time.

3) Terminals with a low starting energy transmit with lower
power at the start, slowly decrease their power with
time, and then raise it, when the energy distribution is
homogeneous and $T' - T$ is large.

4) The equilibrium of the mean field game, when $T' - T$
is large, outperforms the other known equilibria (from [6]
and [12]) in terms of energy-efficiency.

A numerical solution to the general case, with an evolving
channel as well as the energy constraint, is a possibility for
future work. Additional possibilities include considering a
multi-cellular network, distributed base stations, and the
multi-carrier case.